Mismatched Training and Test Distributions Can Outperform Matched Ones

نویسندگان

  • Carlos R. González
  • Yaser S. Abu-Mostafa
چکیده

In learning theory, the training and test sets are assumed to be drawn from the same probability distribution. This assumption is also followed in practical situations, where matching the training and test distributions is considered desirable. Contrary to conventional wisdom, we show that mismatched training and test distributions in supervised learning can in fact outperform matched distributions in terms of the bottom line, the out-of-sample performance, independent of the target function in question. This surprising result has theoretical and algorithmic ramifications that we discuss.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Handset-dependent background models for robust text-independent speaker recognition

This paper studies the e ects of handset distortion on telephone-based speaker recognition performance, resulting in the following observations: (1) the major factor in speaker recognition errors is whether the handset type (e.g., electret, carbon) is di erent across training and testing, not whether the telephone lines are mismatched, (2) the distribution of speaker recognition scores for true...

متن کامل

Free-energy Based Scoring for Speech Recognition

Traditionally, speech recognizers have used a strictly Bayesian paradigm for finding the best hypothesis from amongst all possible hypotheses for the data at hand that is to be recognized. In fact, the Bayes classification rule has been shown to be optimal when the class distributions represent the true distributions of the data to be classified. In reality, however, this condition is not satis...

متن کامل

Statistical properties of infant-directed versus adult-directed speech: insights from speech recognition.

Previous studies have shown that infant-directed speech ('motherese') exhibits overemphasized acoustic properties which may facilitate the acquisition of phonetic categories by infant learners. It has been suggested that the use of infant-directed data for training automatic speech recognition systems might also enhance the automatic learning and discrimination of phonetic categories. This stud...

متن کامل

Impact of noise reduction and spectrum estimation on noise robust speaker identification

Many spectrum estimation methods and speech enhancement algorithms have previously been evaluated for noise-robust speaker identification (SID). However, these techniques have mostly been evaluated over artificially noised, mismatched training tasks with GMM-UBM speaker models. It is therefore unclear whether performance improvements observed with these methods translate to a broader range of n...

متن کامل

Key Rate Available from Mismatched Mesurements in the BB 84 Protocol and the Uncertainty Principle ∗

SUMMARY We consider the mismatched measurements in the BB84 quantum key distribution protocol, in which measuring bases are different from transmitting bases. We give a lower bound on the amount of a secret key that can be extracted from the mismatched measurements. Our lower bound shows that we can extract a secret key from the mismatched measurements with certain quantum channels, such as the...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Neural computation

دوره 27 2  شماره 

صفحات  -

تاریخ انتشار 2015